Evaluating Stemmers and Retrieval Fusion Approaches for Hindi: UNT at FIRE 2010
Abstract
This paper describes the experiments conducted by the University of North Texas team as part of our participation in the Forum for Information Retrieval Evaluation (FIRE). We concentrated on three things: comparing the results obtained with two morphological stemmers (YASS and Morfessor), studying the effect of using a part-of-speech tagger (Conditional Random Fields) to weight the contribution of words within noun phrases, and using a data fusion approach to improve system performance by combining these methods. We conducted our study on Hindi and explored cross-language retrieval performance from English to Hindi using Google translations. Our results show that using the YASS stemmer yields a small increase in retrieval performance. Fusion of results also proved effective, improving results by 5% in our experiments.
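The abstract does not say which fusion method was used. As an illustration only, the sketch below shows one common score-based fusion scheme, CombSUM with min-max normalization, applied to two hypothetical runs for a single query. The function names, document identifiers, and scores are all assumptions made for this example and do not come from the paper.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one run's scores to [0, 1] so that runs are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse runs by summing each document's normalized scores (CombSUM)."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    # Highest fused score first; ties broken by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for one query from two runs (e.g. two stemmers).
run_a = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
run_b = {"d2": 0.8, "d3": 0.6, "d4": 0.2}
fused_ranking = comb_sum([run_a, run_b])
```

In this toy input, a document retrieved by both runs can overtake one ranked first by a single run, which is the usual motivation for fusing results from different stemmers or weighting schemes.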
Similar resources
DCU@FIRE-2012: Rule-based Stemmers for Bengali and Hindi
For the participation of Dublin City University (DCU) in the FIRE-2012 Morpheme Extraction Task (MET), we investigated rule-based stemming approaches for Bengali and Hindi IR. The MET itself is an attempt to obtain a fair and direct comparison between various stemming approaches, measured by comparing the retrieval effectiveness obtained by each on the same dataset. Linguistic knowledge w...
FIRE-2008 at Maryland: English-Hindi CLIR
In this year's Forum for Information Retrieval Evaluation (FIRE), the University of Maryland participated in the ad hoc cross-language document retrieval task, with English queries and Hindi documents. The experiments focused on evaluating the effectiveness of a “meaning matching” approach based on translation probabilities. The FIRE Hindi test collection provides the first opportunity to ...
Developing Morphological Analysers for South Asian Languages: Experimenting with the Hindi and Gujarati Languages
A considerable amount of work has been put into the development of stemmers and morphological analysers. The majority of these approaches use hand-crafted suffix-replacement rules, but a few try to discover such rules from corpora. While most of the approaches remove or replace suffixes, there are examples of derivational stemmers which are based on prefixes as well. In this paper we present a rule-...
DCU@FIRE2012: Monolingual and Crosslingual SMS-based FAQ Retrieval
This paper presents results for DCU’s second participation in the SMS-based FAQ Retrieval task at FIRE. For FIRE 2012, we submitted runs for the monolingual English and Hindi and the crosslingual English to Hindi subtasks. Compared to our experiments for FIRE 2011, our system was simplified by using a single retrieval engine (instead of three) and using a single approach for detecting out-of-do...
DCU@FIRE2010: Term Conflation, Blind Relevance Feedback, and Cross-Language IR with Manual and Automatic Query Translation
For the first participation of Dublin City University (DCU) in the FIRE 2010 evaluation campaign, information retrieval (IR) experiments on English, Bengali, Hindi, and Marathi documents were performed to investigate term conflation (different stemming approaches and indexing word prefixes), blind relevance feedback, and manual and automatic query translation. The experiments are based on BM25 ...
Publication date: 2010